16 research outputs found

    Detection and Evaluation of Bias-inducing Features in Machine Learning

    Full text link
    The cause-to-effect analysis can help us decompose all the likely causes of a problem, such as an undesirable business situation or unintended harm to individuals. This means we can identify how problems are inherited, rank the causes to help prioritize fixes, simplify a complex problem, and visualize it. In the context of machine learning (ML), one can use cause-to-effect analysis to understand the reasons for the biased behavior of a system. For example, we can examine the root causes of bias by checking each feature for a potential cause of bias in the model. To approach this, one can apply small changes to a given feature or a pair of features in the data, following some guidelines, and observe how they impact the decision made by the model (i.e., the model prediction). Therefore, we can use cause-to-effect analysis to identify potential bias-inducing features, even when these features are originally unknown. This is important since most current methods require a pre-identification of sensitive features for bias assessment and can miss other relevant bias-inducing features, which is why systematic identification of such features is necessary. Moreover, achieving an equitable outcome often requires taking sensitive features into account in the model decision. Therefore, it should be up to the domain experts to decide, based on their knowledge of the context of a decision, whether bias induced by specific features is acceptable or not. In this study, we propose an approach for systematically identifying all bias-inducing features of a model to help support the decision-making of domain experts. We evaluated our technique on four well-known datasets to showcase how our contribution can help spearhead the standard procedure for developing, testing, maintaining, and deploying fair/equitable machine learning systems. Comment: 65 pages, manuscript accepted at the EMSE journal, manuscript number EMSE-D-22-00330R.
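    A minimal sketch of the feature-perturbation idea described above, assuming a scikit-learn-style classifier with a predict() method; the names flip_rate and rank_candidate_features are illustrative, not the authors' implementation:

```python
# Minimal sketch of cause-to-effect bias probing by feature perturbation.
# Assumes a trained classifier `model` with a predict() method and a pandas
# DataFrame X of instances; both are illustrative placeholders.
import pandas as pd

def flip_rate(model, X: pd.DataFrame, feature: str, new_value) -> float:
    """Fraction of instances whose prediction changes when `feature`
    is set to `new_value` while all other features are held fixed."""
    original = model.predict(X)
    X_perturbed = X.copy()
    X_perturbed[feature] = new_value
    perturbed = model.predict(X_perturbed)
    return float((original != perturbed).mean())

def rank_candidate_features(model, X: pd.DataFrame, candidate_values: dict) -> list:
    """Rank features by how strongly perturbing them flips model decisions;
    high flip rates point at candidate bias-inducing features."""
    scores = {f: flip_rate(model, X, f, v) for f, v in candidate_values.items()}
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
```

    Features with high flip rates are only candidates; as the abstract notes, domain experts still decide whether the bias induced by a specific feature is acceptable in context.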

    Are Multi-language Design Smells Fault-prone? An Empirical Study

    Full text link
    Nowadays, modern applications are developed using components written in different programming languages. These systems offer several advantages. However, as the number of languages increases, so do the challenges related to the development and maintenance of these systems. In such situations, developers may introduce design smells (i.e., anti-patterns and code smells), which are symptoms of poor design and implementation choices. Design smells are defined as poor design and coding choices that can negatively impact the quality of a software program despite satisfying functional requirements. Studies on mono-language systems suggest that the presence of design smells affects code comprehension, thus making systems harder to maintain. However, these studies target only mono-language systems and do not consider the interaction between different programming languages. In this paper, we present an approach to detect multi-language design smells in the context of JNI systems. We then investigate the prevalence of those design smells. Specifically, we detect 15 design smells in 98 releases of nine open-source JNI projects. Our results show that the design smells are prevalent in the selected projects and persist throughout the releases of the systems. We observe that in the analyzed systems, 33.95% of the files involving communications between Java and C/C++ contain occurrences of multi-language design smells. Some kinds of smells are more prevalent than others, e.g., Unused Parameters, Too Much Scattering, and Unused Method Declaration. Our results suggest that files with multi-language design smells can often be more associated with bugs than files without these smells, and that specific smells are more correlated with fault-proneness than others.
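    As a rough illustration of what detecting one of these smells could look like, the sketch below flags candidate Unused Method Declaration occurrences by scanning Java sources for native declarations that are never called; this regex-based heuristic is an assumption for illustration, not the detection approach used in the paper:

```python
# Rough sketch of detecting the "Unused Method Declaration" smell in JNI code:
# a Java `native` method that is declared but never invoked from Java sources.
# This regex-based scan is illustrative only and far cruder than a real detector.
import re
from pathlib import Path

NATIVE_DECL = re.compile(r'\bnative\s+\w[\w<>\[\]]*\s+(\w+)\s*\(')

def unused_native_declarations(java_root: str) -> list[str]:
    sources = [p.read_text(errors="ignore") for p in Path(java_root).rglob("*.java")]
    declared = {name for src in sources for name in NATIVE_DECL.findall(src)}
    unused = []
    for name in declared:
        # Count occurrences of "name(" and subtract the declaration sites.
        calls = sum(len(re.findall(rf'\b{name}\s*\(', src)) for src in sources)
        decls = sum(len(re.findall(rf'\bnative\b[^;]*\b{name}\s*\(', src)) for src in sources)
        if calls <= decls:  # no call site beyond the declaration(s)
            unused.append(name)
    return unused
```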

    Reuse and maintenance practices among divergent forks in three software ecosystems

    Get PDF
    With the rise of social coding platforms that rely on distributed version control systems, software reuse is also on the rise. Many software developers leverage this reuse by creating variants through forking, to account for different customer needs, markets, or environments. Forked variants then form a so-called software family; they share a common code base and are maintained in parallel by the same or different developers. As such, software families can easily arise within software ecosystems, which are large collections of interdependent software components maintained by communities of collaborating contributors. However, little is known about the existence and characteristics of such families within ecosystems, especially about their maintenance practices. Improving our empirical understanding of such families will help build better tools for maintaining and evolving them. We empirically explore maintenance practices in fork-based software families within ecosystems of open-source software. Our focus is on three of the largest software ecosystems in existence today: Android, .NET, and JavaScript. We identify and analyze software families that are maintained together and that exist both on the official distribution platforms (Google Play, NuGet, and npm) as well as on GitHub, allowing us to analyze reuse practices in depth. We mine and identify 38, 526, and 8,837 software families from the Android, .NET, and JavaScript ecosystems, respectively, and study their characteristics and code-propagation practices. We provide scripts for analyzing code integration within our families. Interestingly, our results show that there is little code integration across the studied software families from the three ecosystems. Our studied families also show that direct integration using git outside of GitHub is more commonly used than GitHub pull requests. Overall, we hope to raise awareness about the existence of software families within larger ecosystems of software, calling for further research and better tool support to effectively maintain and evolve them.
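    The abstract mentions scripts for analyzing code integration; below is a minimal sketch of one way code propagation between a mainline and a fork could be approximated with git patch-ids. The helper names and the whole-history comparison are assumptions, not the study's actual scripts:

```python
# Sketch: approximate code propagation between two fork variants by comparing
# patch-ids of their commits. `git patch-id --stable` assigns the same id to
# equivalent diffs, so cherry-picked or rebased commits are matched even when
# their commit hashes differ. Paths and helper names are illustrative only.
import subprocess

def patch_ids(repo_path: str) -> set[str]:
    """Return the set of git patch-ids for all commits in `repo_path`."""
    log = subprocess.run(
        ["git", "-C", repo_path, "log", "-p", "--all"],
        capture_output=True, text=True, check=True,
    ).stdout
    ids = subprocess.run(
        ["git", "-C", repo_path, "patch-id", "--stable"],
        input=log, capture_output=True, text=True, check=True,
    ).stdout
    return {line.split()[0] for line in ids.splitlines() if line.strip()}

def propagation_ratio(mainline: str, fork: str) -> float:
    """Share of the fork's patches that also appear in the mainline."""
    main_ids, fork_ids = patch_ids(mainline), patch_ids(fork)
    return len(fork_ids & main_ids) / max(len(fork_ids), 1)
```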

    An Empirical Study of Testing and Release Practices for Machine Learning Software Systems

    No full text
    We are witnessing an increasing adoption of machine learning (ML) and deep learning (DL) algorithms in many software systems, including safety-critical systems such as health care systems or autonomous driving vehicles. On the one hand, ensuring the software quality of these systems is still an open challenge for the research community, mainly due to the inductive nature of ML software systems. On the other hand, ML and release engineering teams are continuously required to deliver high-quality ML software products to end users. Recent research advances in the quality assurance of ML systems have been adapting different concepts from traditional software testing, such as mutation testing, to help improve the reliability of ML software systems. Also, to assist in the delivery process of these systems, modern ML software companies are proposing new changes in their delivery process that adapt to new technologies such as continuous deployment and Infrastructure-as-Code. However, ML and release engineers still find these practices challenging and resort to question-and-answer websites such as StackOverflow to find answers. Regarding ML software quality, it is unclear whether any of the testing techniques proposed in research are adopted in practice. Moreover, there is little empirical evidence about the testing strategies of ML engineers. Software testing and release engineering together are important for the efficient delivery of reliable ML applications.
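    As a hedged illustration of adapting mutation testing to ML (an assumption for illustration, not a technique evaluated in this thesis): mutate the training data, retrain, and treat the mutant as killed if accuracy drops below a chosen threshold. The model, data, and threshold below are illustrative placeholders.

```python
# Sketch of one common adaptation of mutation testing to ML: inject a
# label-noise mutant into the training data, retrain, and consider the mutant
# "killed" if evaluation accuracy falls below a chosen threshold.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score

def label_noise_mutant(y: np.ndarray, fraction: float, rng: np.random.Generator) -> np.ndarray:
    """Flip the labels of a random `fraction` of training examples (binary 0/1 labels)."""
    y_mut = y.copy()
    idx = rng.choice(len(y), size=int(fraction * len(y)), replace=False)
    y_mut[idx] = 1 - y_mut[idx]
    return y_mut

def mutant_killed(X_train, y_train, X_test, y_test,
                  threshold: float = 0.9, fraction: float = 0.2, seed: int = 0) -> bool:
    """Retrain on mutated labels and check whether the accuracy test kills the mutant."""
    rng = np.random.default_rng(seed)
    model = LogisticRegression(max_iter=1000)
    model.fit(X_train, label_noise_mutant(y_train, fraction, rng))
    return accuracy_score(y_test, model.predict(X_test)) < threshold
```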

    Clone-based variability management in the Android ecosystem

    No full text
    Mobile app developers often need to create variants to account for different customer segments, payment models, or functionalities. A common strategy is to clone (or fork) an existing app and then adapt it to new requirements. This form of reuse has been enhanced with the advent of social-coding platforms such as GitHub, cultivating more systematic reuse. Different facilities, such as forks, pull requests, and cross-project traceability, support clone-based development. Unfortunately, even though many apps are known to be maintained in multiple variants, little is known about how practitioners manage variants of mobile apps. We present a study that explores clone-based reuse practices for open-source Android apps. We identified and analyzed families of apps that are maintained together and that exist both on the official app store (Google Play) as well as on GitHub, allowing us to analyze reuse practices in depth. We mined both repositories to identify app families and to study their characteristics, including their variabilities as well as code-propagation practices and maintainer relationships. We found that, indeed, app families exist and that forked app variants fall into the following categories: (i) re-branding and simple customizations, (ii) feature extension, (iii) supporting of the mainline app, and (iv) implementation of different, but related, features. Other notable characteristics of the app families we discovered include: (i) 72.7% of the app families did not perform any form of code propagation, and (ii) 74% of the app families we studied do not have common maintainers.
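    A minimal sketch of how common maintainers between a mainline app and a forked variant could be approximated from git history; the committer-email identity heuristic is an assumption, not necessarily the study's methodology:

```python
# Sketch: approximate the overlap of maintainers between a mainline app and a
# forked variant by intersecting committer emails from their git histories.
import subprocess

def committer_emails(repo_path: str) -> set[str]:
    """Collect the committer emails appearing anywhere in the repository's history."""
    out = subprocess.run(
        ["git", "-C", repo_path, "log", "--format=%ce", "--all"],
        capture_output=True, text=True, check=True,
    ).stdout
    return {line.strip().lower() for line in out.splitlines() if line.strip()}

def common_maintainers(mainline: str, fork: str) -> set[str]:
    """Maintainers (by email) active in both the mainline and the fork."""
    return committer_emails(mainline) & committer_emails(fork)
```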

    Studying the Practices of Deploying Machine Learning Projects on Docker

    No full text